ABSTRACT

MicroRNAs (miRNAs) are small non-coding RNA molecules that
negatively regulate gene expression to control many cellular
mechanisms. The miRTarBase database
(http://mirtarbase.mbc.nctu.edu.tw/) provides the most current
and comprehensive information on experimentally validated
miRNA-target interactions. The database was launched in 2010
with data sources for >100 published studies in the
identification of miRNA targets, molecular networks of miRNA
targets and systems biology, and the current release (2013,
version 4) includes significant expansions and enhancements
over the initial release (2010, version 1). This article
reports the current status of and recent improvements to the
database, including (i) a 14-fold increase in miRNA-target
interaction entries, (ii) a miRNA-target network, (iii)
expression profiles of miRNAs and their target genes, (iv)
miRNA target-associated diseases and (v) additional utilities
including an upgrade reminder and an error reporting/user
feedback system.



INTRODUCTION 

MicroRNAs (miRNAs) are non-coding RNAs ~19-25 nt in length,
which are widely found in organisms such as plants, nematodes,
fruit flies and mammals (1). lin-4, the first miRNA to be
identified, comes from Caenorhabditis elegans and was found to
control the timing and progression of the nematode life cycle
(2). In humans, miRNAs play important roles in cellular
physiology, development and disease by negatively regulating
gene expression (3). miRNAs bind to complementary sequences in
the 3' untranslated regions of their target mRNAs and induce
mRNA degradation or translation repression (4).

miRNAs play important roles in causing many diseases 
including various types of cancer, cardiovascular diseases 
and neurological disorders (5-8). Thus, as shown in
Supplementary Figure S1, miRNAs and the field of non-coding
RNA have attracted increasing research interest. In recent
years, many databases related to miRNAs have been developed,
providing information about miRNAs and their target genes.
miRBase (9) is the central repository for miRNA nomenclature,
sequence data, annotation and target prediction, containing
~24 521 miRNA entries. Many databases such as microRNA.org
(10), miRGen (11), miRGator (12), miRDB (13) and
miRNAMap (14) identify miRNA target interactions 
(MTIs) using a number of target prediction tools like 
TargetScan (15,16), miRanda (17), PicTar (18), 
RNAhybrid (19) and PITA (20). Furthermore, some data- 
bases provide evidence for experimentally validated 
miRNAs and their target genes. TarBase (21) contains a 
manually curated collection of experimentally tested 
miRNA targets, in humans, mice, fruit flies, worms and 
zebrafish. miRecords (22), an integrated resource for animal
miRNA-target interactions, hosts a large, high-quality
manually curated database of experimentally validated
miRNA-target interactions with systematic experimental
documentation for each interaction. The
HMDD (23) database is the first resource to provide ex- 
perimentally supported human miRNA and disease asso- 
ciations. miR2Disease (24) provides a brief description of 
the miRNA-disease relationship along with information 
about miRNA expression patterns as well as experimen- 
tally verified miRNA target genes and literature refer- 
ences. DIANA-LncBase (25) provides comprehensive 
annotations of miRNA targets on long non-coding 
RNAs with transcriptome-wide experimentally verified 
and computationally predicted miRNA recognition 
elements. miRSel (26), a miRNA-gene association 
database, combines text-mining results with existing databases
and computational predictions. miRWalk (27) presents predicted
and validated information on miRNA-target interactions and
enables researchers to validate new targets of miRNAs not only
in 3' untranslated regions but also in other regions of all
known genes.
By including experimental evidence, these research 
resources are highly effective in identifying MTIs. 

Before proceeding with experimental validation, a 
number of computational programs are used to predict a 
putative miRNA binding site within a given mRNA 
target. Once the predicted miRNA binding sites have 
been determined, these MTIs are then validated by mo- 
lecular experiments, including reporter assays and western 
blots, which are the conventional methods for confirming 
miRNA and target gene interactions. The rationale for 
using the reporter assay is that the binding of a given 
miRNA to its specific mRNA target site will repress 
reporter protein production thereby reducing expression, 
so that the inhibited level can be easily compared with 
control. Experiments like northern blot analysis or
quantitative real-time PCR (qPCR) use total RNA isolated from
a specific cell type and examine the coexpression of miRNA and
mRNA. One typical approach to validate the functional
importance of a miRNA/mRNA target pair is the transient
overexpression of a given miRNA mimic in a cell type known to
express the putative target protein, followed by western blot
analysis (28). Recently, genome-wide screening experiments
have been developed, including microarrays with overexpression
or knockdown of miRNAs, stable isotope labeling with amino
acids in culture (SILAC) and pulsed SILAC [pSILAC (29)].

The identification of the roles of miRNAs and their targets in
different biological systems raises the need for easily
accessible and frequently updated central information
repositories. miRTarBase serves as an important repository for
experimentally validated MTIs, which are frequently updated by
manually surveying research articles. In addition, miRTarBase
contains the largest number of validated MTIs with strong
evidentiary support, and the collection is more frequently
updated than other databases such as TarBase, miRecords and
miR2Disease. Table 1 summarizes features added in the latest
update.

IMPROVEMENTS 

Table 1 lists the advancements and new features supported in
the 2014 miRTarBase update. Major improvements include (i) a
14-fold increase in miRNA-target interaction entries as
compared with the initial release, (ii) a miRNA-target
network, (iii) expression profiles of miRNAs and their target
genes, (iv) miRNA target-associated diseases and (v)
additional utilities including an upgrade reminder service and
an error reporting/user feedback system.

Updated database content 

In the 2014 update, 51 460 curated MTIs between 1232 miRNAs
and 17 520 target genes were collected from 2636 articles.
Table 2 lists the number of collected MTIs in each species. In
all, 38 113 human MTIs were collected between 587 miRNAs and
12 194 target genes with experimental support from 2143
articles; of these, 1778 and 3026 interactions were confirmed
by western blot and reporter assays, respectively. Each
research article was carefully reviewed by at least two of our
developers to extract the MTIs, which were experimentally
confirmed by reporter assay, western blot, qPCR, microarray,
pSILAC or NGS (CLIP-seq or Degradome-seq). The 2014 update
included a large increase in the number of MTIs supported by
strong experimental evidence (as validated by reporter assay,
western blot or qPCR; Figure 1).
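
The evidence tiers described above can be sketched in a few lines of Python. This is only an illustration: the record format (a mapping from an MTI identifier to a list of method names) and the function names are assumptions made for the example, not the curation pipeline used by miRTarBase. The "strong" group follows the text (reporter assay, western blot, qPCR) and the "less strong" group follows the column grouping of Table 2.

```python
# Validation methods named in the text. The two tiers follow the
# "strong" / "less strong" split used in this section and in Table 2.
STRONG_METHODS = {"reporter assay", "western blot", "qpcr"}
LESS_STRONG_METHODS = {"microarray", "psilac", "ngs"}


def evidence_level(methods):
    """Return 'strong', 'less strong' or 'unknown' for one MTI.

    `methods` is an iterable of validation-method names (hypothetical
    record format); matching is case-insensitive.
    """
    normalized = {m.strip().lower() for m in methods}
    if normalized & STRONG_METHODS:
        return "strong"
    if normalized & LESS_STRONG_METHODS:
        return "less strong"
    return "unknown"


def count_by_level(mtis):
    """Count MTIs per evidence level; `mtis` maps an MTI id to its methods."""
    counts = {"strong": 0, "less strong": 0, "unknown": 0}
    for methods in mtis.values():
        counts[evidence_level(methods)] += 1
    return counts
```

An MTI supported by both a western blot and a microarray is counted once, in the strong tier, because any single strong method is taken as sufficient in this sketch.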

Experimental validation method — addition of the 'NGS' 
support type 

Experimental approaches for identifying MTIs (e.g. the
reporter assay) are time-consuming and incapable of handling
large-scale screenings. Recent studies have demonstrated that
MTIs can be uncovered via high-throughput screening using
next-generation sequencing (NGS) technology. Ultraviolet
cross-linking and immunoprecipitation (CLIP) was first
developed to identify specific Nova RNA-protein complexes in
mouse brain tissue (30). Chi et al. (31) pioneered the use of
the cross-linking and immunoprecipitation approach combined
with NGS technologies (CLIP-seq or HITS-CLIP) to identify MTIs
by obtaining Argonaute-miRNA-protein complexes in mouse brain
tissue. Furthermore, Hafner et al. (32) modified the CLIP-seq
method [as photoactivatable-ribonucleoside-enhanced
cross-linking and immunoprecipitation (PAR-CLIP)] to enhance
the Argonaute-miRNA-protein complex resolution of the original
CLIP-seq method. German et al. (33) also developed an NGS
approach to detect MTIs by identifying mRNA cleavage products
through parallel analysis of

Table 1. Comparison of data and function between miRTarBase version 1.0 and miRTarBase version 4.0

Table 2. The statistics of miRTarBase entries
Values are counts; MTIs, miRNA-target interactions.

                                                              No. of miRNA-target interactions experimentally validated by
                                                              Strong evidence               Less strong evidence
Species    MTIs          miRNAs   Target genes  Articles(a)   Western blot  Reporter assay  pSILAC  Microarray  NGS
Human      38 113        587      12 194        2143          1778          3026            495     11 704      20 492


Mouse
Zebrafish  114  30  82  40  36
129  39  77  37
12
92
17
13  20
7  15  5  16  1  7  0  0  0


Total      51 460        1232     17 520        2636          2407          4118            497     12 547      31 737

(a) Articles may report various miRNA-target interactions in different species.



RNA ends (PARE), also known as degradome-seq. Because
RNA-induced silencing complex (RISC)-mediated cleavage is not
the major mechanism of miRNA regulation in mammals, this
approach is mainly used in plants such as Arabidopsis (33,34)
and rice (35), and only limited records are available for
mammals (36-38). Several databases have been developed to
compile publicly available CLIP-seq, PAR-CLIP and
degradome-seq data sets for analysis, such as CLIPZ (39),
starBase (40), doRiNA (41) and TarBase 6.0. In contrast, CLASH
(cross-linking, ligation and sequencing of hybrids) was
recently developed to directly map miRNA-mRNA binding sites
without relying on target prediction (42,43).

The database was populated with entries derived from manually
curated articles. The curators noted the miRNA, the related
target gene and information regarding the experiment such as
the cell line or tissue used. Besides MTIs included in the
initial miRTarBase release and validated using reporter
assays, western blots, qPCR, microarrays and pSILAC, the
updated version supports NGS data (NGS: CLIP-seq,
Degradome-seq and CLASH-seq). Here we incorporate 21 human
CLIP-seq data sets, 5 mouse CLIP-seq data sets, 6 nematode
data sets and one human CLASH-seq data set into miRTarBase.
The CLASH-seq data set provides a large number of miRNA
binding sites, including canonical and non-canonical
miRNA-target interactions, which makes the information on
miRNA-target interactions more complete.

Figure 1. Number of miRNA-target interactions with 'strong
experimental evidence' in TarBase version 6, miRTarBase
version 1 and miRTarBase version 4.

miRNA target-associated diseases 

miRNA-related dysfunctions are associated with a broad 
spectrum of diseases, including various types of cancer, car- 
diovascular diseases and neurological disorders, and 
miRNAs have emerged as a novel class of potential bio- 
markers or targets for disease diagnosis and therapy. To 
provide more information about miRNA-associated 
diseases and the relationship between miRNA-target inter- 
actions and disease, the data contents of HMDD version 
2.0 and miR2Disease are integrated in miRTarBase. In 
addition to the integration of external disease databases, 
experimentally verified miRNA-associated diseases are 
manually curated through literature surveys. 

miRNA-target network 

Interactions between a given MTI and other miRNAs/mRNAs in
miRTarBase can be visualized through an interactive network
web interface built by integrating CytoscapeWeb (44). This
network visualization can help researchers understand
complicated miRNA-target regulation. For example, given an MTI
(hsa-miR-26a-5p and EZH2), Figure 2 shows a miRNA-target
regulatory network consisting of the first-order neighbors of
this MTI. The visualization shows that miR-26a and miR-217
both regulate EZH2 and PTEN, an example of complicated
regulation in which several miRNAs share several targets.
Furthermore, we examined the functions of the target genes
involved in the miRNA-target interactions collected in the
database by performing Gene Ontology (GO) and KEGG pathway
enrichment annotation using the DAVID gene annotation tool
(45).
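
The first-order neighborhood of an MTI can be illustrated with a small Python sketch. It follows the definition given in the Figure 2 legend (all target genes of the query miRNA plus all miRNAs that target the query gene). The function names and the in-memory list of (miRNA, gene) pairs are assumptions made for the example; the web interface itself is built on CytoscapeWeb, and edges among the neighbors themselves are deliberately left out here.

```python
from collections import defaultdict


def build_index(mtis):
    """Index (miRNA, gene) pairs in both directions."""
    targets_of = defaultdict(set)     # miRNA -> genes it targets
    regulators_of = defaultdict(set)  # gene  -> miRNAs that target it
    for mirna, gene in mtis:
        targets_of[mirna].add(gene)
        regulators_of[gene].add(mirna)
    return targets_of, regulators_of


def first_order_network(mtis, mirna, gene):
    """Edges linking the query MTI to its first-order neighbors.

    Returns a sorted list of (miRNA, gene) edges: every target of
    `mirna` and every miRNA that targets `gene`.
    """
    targets_of, regulators_of = build_index(mtis)
    if gene not in targets_of.get(mirna, ()):
        raise KeyError(f"{mirna} -> {gene} is not a known MTI")
    edges = {(mirna, g) for g in targets_of[mirna]}
    edges |= {(m, gene) for m in regulators_of[gene]}
    return sorted(edges)


# Toy data: the four interactions mentioned in the text plus one
# hypothetical, unrelated pair that must not appear in the result.
EXAMPLE_MTIS = [
    ("hsa-miR-26a-5p", "EZH2"),
    ("hsa-miR-26a-5p", "PTEN"),
    ("hsa-miR-217", "EZH2"),
    ("hsa-miR-217", "PTEN"),
    ("miR-X", "GENE-Y"),  # hypothetical placeholder
]
```

Calling `first_order_network(EXAMPLE_MTIS, "hsa-miR-26a-5p", "EZH2")` returns the query edge, the miR-26a to PTEN edge and the miR-217 to EZH2 edge, while the unrelated placeholder pair is excluded.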

Expression profile of miRNA and its target gene 

Many previous studies have integrated miRNA and mRNA
expression profiles to predict MTIs with fewer false positives
and more biologically meaningful targets (12,46-50). The
correlation between miRNA and mRNA expression provides an
important indication of direct miRNA targets. In addition,
miRNA and mRNA expression data provide dynamic information for
MTIs, which can help biologists investigate phenotype-specific
miRNA regulatory pathways. Thus, we collected matched human
miRNA and mRNA expression data to provide phenotype-specific
miRNA-mRNA correlation analysis.

Once a user finds an interesting MTI in miRTarBase, the next
step is to determine which phenotype conditions can activate
the MTI. Some MTIs are active in various conditions, but
others are active only in a specific phenotype. To address
this issue, we provide phenotype-specific MTI coexpression
profiles for many data sets with various phenotypes. We
selected 21 human data sets from the NCBI GEO database with at
least 9 matched mRNA and miRNA samples (51), producing 1596
samples. These data sets were originally generated to study
both mRNA and miRNA profiles under various and complex
phenotypes, e.g. mRNA and miRNA expression profiles of the
renal cortex for hypertensive and normotensive patients
(GSE28260), and mRNA and miRNA expression profiles of breast
cancer cells treated with the novel histone deacetylase
inhibitor CG-1521 (GSE25844). Supplementary Table S1 shows the
data set title, platform and number of miRNA/mRNA samples. As
GSE19783 has 216 samples including estrogen receptor negative
(ER-) and estrogen receptor positive (ER+) samples, we divided
GSE19783 into two separate data sets: GSE19783 ER- and
GSE19783 ER+.
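
The selection rule and the phenotype split described above could look roughly like the following Python sketch. It assumes that mRNA and miRNA profiles of the same specimen share a sample identifier and that a phenotype label is available per sample; both are assumptions made for the example, as are the function names. Only the threshold of 9 matched samples comes from the text.

```python
MIN_MATCHED_SAMPLES = 9  # threshold stated in the text


def matched_samples(mrna_samples, mirna_samples):
    """Return the sample ids profiled on both the mRNA and miRNA platform."""
    return sorted(set(mrna_samples) & set(mirna_samples))


def is_eligible(mrna_samples, mirna_samples, minimum=MIN_MATCHED_SAMPLES):
    """True if the data set has at least `minimum` matched samples."""
    return len(matched_samples(mrna_samples, mirna_samples)) >= minimum


def split_by_phenotype(sample_ids, phenotype_of):
    """Group sample ids by phenotype label, e.g. 'ER-' and 'ER+'.

    `phenotype_of` maps a sample id to its label (hypothetical input).
    """
    groups = {}
    for sample_id in sample_ids:
        groups.setdefault(phenotype_of[sample_id], []).append(sample_id)
    return groups
```

Each phenotype group returned by `split_by_phenotype` can then be treated as a separate data set, in the same way that GSE19783 is divided into ER- and ER+ subsets.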

We integrated human gene expression data from multiple
platforms including 11 mRNA and 8 miRNA platforms. In the mRNA
expression data, all data sets were log-transformed and
median-centered per sample, and standard deviations were
normalized to one per sample. In addition, if one gene had
several probes, we calculated the mean expression value for
the gene. For each miRNA and mRNA pair, we calculated the
Pearson correlation between miRNA and mRNA expression profiles
for each data set. To estimate the significance of the MTI
using expression profiles, we converted the Pearson
correlation to a P-value for each MTI as follows. Given an MTI
with a Pearson correlation r and a sample size n, we
calculated the t-value (52) as follows:

\[ t = r \sqrt{\frac{n - 2}{1 - r^{2}}} \]

The P-value was calculated using the t-value and the
t-distribution with n - 2 degrees of freedom.
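
As a worked illustration of this calculation, the following Python sketch computes the Pearson correlation, the t-value and a P-value using only the standard library. It is a minimal sketch under stated assumptions: the P-value is taken to be two-sided (the text does not say whether a one-sided or two-sided test was used), the t-distribution tail is evaluated through the regularized incomplete beta function with a textbook continued-fraction expansion, and all function names are chosen for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two profiles of equal length >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def t_value(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def _betacf(a, b, x, max_iter=300, eps=1e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (Lentz's method)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def two_sided_p(t, df):
    """Two-sided P-value of a t statistic (assumption: two-sided test)."""
    if math.isinf(t):
        return 0.0
    return betainc(df / 2.0, 0.5, df / (df + t * t))


def mti_p_value(mirna_profile, mrna_profile):
    """Return (r, P) for one miRNA/mRNA pair within one data set."""
    r = pearson_r(mirna_profile, mrna_profile)
    n = len(mirna_profile)
    return r, two_sided_p(t_value(r, n), n - 2)
```

For example, a correlation of r = 0.5 over n = 5 samples gives t = 0.5 * sqrt(3 / 0.75) = 1.0 with 3 degrees of freedom. Where SciPy is available, `scipy.stats.pearsonr` returns the correlation together with a two-sided P-value directly.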

Given a human MTI, miRTarBase shows an expression table with a
data set-specific correlation and P-value for all expression
data sets (Figure 3A), which indicates how likely the MTI is
to be active in a certain phenotype. Furthermore, when the
user selects a data set, miRTarBase draws the expression
profiles of the miRNA and mRNA of the MTI (Figure 3B). To
facilitate the comparison of miRNA and mRNA profiles, we
normalized both profiles to a mean of 0 and a standard
deviation of 1.
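
The normalization steps mentioned in this section (per-sample log transformation, median centering and scaling to unit standard deviation, averaging of probes per gene, and scaling of a profile to mean 0 and standard deviation 1 for display) can be sketched as follows. The base of the logarithm (2) and the use of the sample standard deviation are assumptions, since the text specifies neither, and the function names and input formats are chosen for the example.

```python
import math
import statistics


def normalize_sample(values):
    """Log2-transform one sample, median-center it and scale to unit SD.

    Assumptions: base-2 logarithm and sample (n - 1) standard deviation.
    """
    logged = [math.log2(v) for v in values]
    median = statistics.median(logged)
    centered = [v - median for v in logged]
    sd = statistics.stdev(centered)
    return [v / sd for v in centered]


def mean_per_gene(probe_values, probe_to_gene):
    """Average the values of all probes that map to the same gene."""
    by_gene = {}
    for probe, value in probe_values.items():
        by_gene.setdefault(probe_to_gene[probe], []).append(value)
    return {gene: statistics.mean(vals) for gene, vals in by_gene.items()}


def zscore(profile):
    """Scale one expression profile to mean 0 and standard deviation 1."""
    mean = statistics.mean(profile)
    sd = statistics.stdev(profile)
    return [(v - mean) / sd for v in profile]
```

Putting a miRNA profile and an mRNA profile through `zscore` places them on the same scale, so an inverse relationship between the two is visible when they are drawn on one plot.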

Figure 2. miRNA-target regulatory network. Given an MTI hsa-miR-26a-5p and EZH2, miRTarBase shows a miRNA-target regulatory network
consisting of all the target genes of hsa-miR-26a-5p and all the miRNAs that target EZH2.



To demonstrate the phenotype-specific correlation analysis,
Figure 3 shows miRNA/mRNA expression profiles for miR-26a and
EZH2. Three data sets have significantly negative correlations
between miR-26a and EZH2 (P < 0.001), and the top two data
sets are from breast cancers (Figure 3A). Notably, Zhang et
al. reported that miR-26a is a tumor suppressor and targets
EZH2 in breast cancer (53), which is consistent with our
results.

ENHANCED INTERFACE 

To facilitate access to data and further analyses to support
research on miRNA-target interactions, various query
interfaces and graphical visualization pages were redesigned
and enhanced. miRTarBase provides two modes for querying
information on a specific MTI: the species browser and the
search utility. Users can browse through numerous miRNA
targets and explore the outcomes of hundreds of experimental
studies in a simple and intuitive way. The user can perform
basic MTI searches by miRNA, target gene symbol, validation
method or PubMed ID. The search box suggests plausible
keywords and supports autocompletion as users type in the
search field. Each MTI output page consists of basic
information including miRNA and target gene information,
miRNA-associated diseases, experimental evidence with
literature references, miRNA-target expression profiles and
the miRNA-target regulatory network. To visualize the
interactions between an MTI and other miRNAs/mRNAs, we provide
an interactive network web interface that presents a
regulatory network consisting of the MTI's first-order
neighbors. In addition, the upgrade integrates an upgrade
reminder and an error reporting/user feedback mechanism. User
feedback is extremely valuable for further improving the
database as a comprehensive and user-friendly tool for
exploring miRNA-target interactions.
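
The basic search and keyword suggestion described in this section can be mimicked with a short Python sketch. It does not reflect how the miRTarBase web server is implemented: the record fields (`mirna`, `gene`, `method`, `pmid`), the function names and the in-memory data are all hypothetical and serve only to show exact-match filtering on the four search keys and prefix-based autocompletion.

```python
from bisect import bisect_left


def search(records, **criteria):
    """Return the records whose fields equal every given criterion.

    Comparison is case-insensitive; `records` is a list of dicts with
    hypothetical keys such as 'mirna', 'gene', 'method' and 'pmid'.
    """
    def matches(record):
        return all(str(record.get(key, "")).lower() == str(value).lower()
                   for key, value in criteria.items())
    return [record for record in records if matches(record)]


def autocomplete(vocabulary, prefix, limit=10):
    """Suggest up to `limit` keywords that start with `prefix`."""
    keyed = sorted({word.lower(): word for word in vocabulary}.items())
    lowered = [key for key, _ in keyed]
    wanted = prefix.lower()
    suggestions = []
    # Binary search finds the first keyword >= prefix; all keywords that
    # share the prefix follow it contiguously in sorted order.
    for key, original in keyed[bisect_left(lowered, wanted):]:
        if not key.startswith(wanted) or len(suggestions) == limit:
            break
        suggestions.append(original)
    return suggestions
```

With a vocabulary containing `hsa-miR-26a-5p`, `hsa-miR-217`, `EZH2` and `PTEN`, typing `hsa-miR-2` would suggest both miRNA names, and `search(records, gene="EZH2")` would return every stored interaction whose target gene is EZH2.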



CONCLUSIONS AND PERSPECTIVES 

The current update represents a 14-fold increase in MTIs as
compared with the initial miRTarBase release, and includes a
significant extension of specific research-oriented features.
miRTarBase 4 is designed to serve as a multifaceted tool for
providing extensive experimental support in all miRNA-related
research. Moving forward, the database will be updated at
2-month intervals to capture the growing number of
publications covering novel targets. To summarize complicated
miRNA-target regulations, we provide a visualization of the
miRNA-target regulatory network consisting of the MTI's
first-order neighbors. To support dynamic MTI information, we
collected matched human miRNA/mRNA expression data sets to
provide phenotype-specific correlations between miRNA and mRNA
expression profiles.

The latest release incorporates human miRNA-target expression
profiles from 21 GEO data sets, all of which were generated on
microarray platforms. Future work will extend our functional
and expression analyses to other species as miRNA-target
expression profiles for these species become available, so
that miRTarBase will eventually provide sufficient information
to support any miRNA-related work.

Figure 3. Phenotype-specific correlation analysis. (A) Given an MTI hsa-miR-26a-5p and EZH2, miRTarBase shows the Pearson correlation,
P-value and miRNA/mRNA expression profile for each data set. This table shows the top 4 data sets with significant P-values. (B) When the
user selects a data set, miRTarBase shows the detailed profiles for hsa-miR-26a-5p and EZH2 in this data set.



AVAILABILITY 

The miRTarBase content will be continuously maintained and
updated every 2 months. The database is now publicly
accessible at http://miRTarBase.mbc.nctu.edu.tw/.



SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online, including
[54-69].